Only start an attempt if not finished. Send message to worker if pending executing #2377
Walkthrough
The PR tightens run lifecycle status checks. In runAttemptSystem.ts, startRunAttempt now prevents starting attempts when the latest snapshot executionStatus is finished or pending-finished by using isFinishedOrPendingFinished. cancelRun now treats runs with executionStatus executing or pending-executing (via isPendingExecuting) as eligible to transition to PENDING_CANCEL and notify the worker. In execution.ts, when a snapshot executionStatus is FINISHED the shutdown reason passed is changed to "already-finished" instead of "re-queued". No public/exported signatures changed.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
Actionable comments posted: 0
🧹 Nitpick comments (2)
internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts (2)
353-355: Block starting attempts when snapshot is finished or cancelling (good); refine error details
The broadened guard prevents attempts when the latest snapshot is FINISHED or PENDING_CANCEL, which matches the intent. Nit: the error message mentions only “finished,” and 409 (Conflict) reflects a state conflict better than 400.
Apply this diff to improve clarity:
- if (isFinishedOrPendingFinished(latestSnapshot.executionStatus)) {
-   throw new ServiceValidationError("Task run is already finished", 400);
- }
+ if (isFinishedOrPendingFinished(latestSnapshot.executionStatus)) {
+   throw new ServiceValidationError("Task run is finished or cancelling", 409);
+ }
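In isolation, the suggested guard could look roughly like the sketch below. It is not the repository's code: the `ExecutionStatus` union, the exact statuses that `isFinishedOrPendingFinished` covers (taken from the description in this comment), the `ServiceValidationError` shape, and the `assertCanStartAttempt` name are all assumptions made for illustration.

```typescript
// Hypothetical stand-ins for the real types in statuses.ts and errors.ts.
type ExecutionStatus =
  | "PENDING_EXECUTING"
  | "EXECUTING"
  | "PENDING_CANCEL"
  | "FINISHED";

// Assumed shape: a message plus an optional HTTP status code.
class ServiceValidationError extends Error {
  status?: number;

  constructor(message: string, status?: number) {
    super(message);
    this.name = "ServiceValidationError";
    this.status = status;
  }
}

// Assumed semantics, per the review comment: FINISHED or PENDING_CANCEL.
function isFinishedOrPendingFinished(status: ExecutionStatus): boolean {
  return status === "FINISHED" || status === "PENDING_CANCEL";
}

// Hypothetical helper: the check that would run before starting an attempt.
function assertCanStartAttempt(status: ExecutionStatus): void {
  if (isFinishedOrPendingFinished(status)) {
    // 409 signals a conflict with the run's current state.
    throw new ServiceValidationError("Task run is finished or cancelling", 409);
  }
}
```

A caller that maps `status` onto the HTTP response could then distinguish a state conflict (409) from a malformed request (400).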
1329-1332: Notify worker when run is pending executing (matches intent); optional helper for readability
Including PENDING_EXECUTING in the “notify worker and set PENDING_CANCEL” path is correct and aligns with the PR goal. As a minor readability tweak, consider extracting a small helper (e.g., isExecutingOrPendingExecuting) to centralize this combined check.
If desirable, add to statuses.ts:
export function isExecutingOrPendingExecuting(status): boolean {
  return isExecuting(status) || isPendingExecuting(status);
}
Then replace the condition here accordingly.
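How the combined check might drive the cancel path can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual cancelRun implementation: the status union is reduced (the `QUEUED` member stands in for any status that is not executing), `isExecuting` is narrowed to a single status, and `planCancel` with its return shape is a hypothetical name introduced only for this example.

```typescript
// Reduced, hypothetical status union; the real one has more members.
type ExecutionStatus =
  | "QUEUED"
  | "PENDING_EXECUTING"
  | "EXECUTING"
  | "PENDING_CANCEL"
  | "FINISHED";

// Simplified stand-ins for the helpers in statuses.ts.
function isExecuting(status: ExecutionStatus): boolean {
  return status === "EXECUTING";
}

function isPendingExecuting(status: ExecutionStatus): boolean {
  return status === "PENDING_EXECUTING";
}

// The helper proposed in the review comment.
function isExecutingOrPendingExecuting(status: ExecutionStatus): boolean {
  return isExecuting(status) || isPendingExecuting(status);
}

// Hypothetical result type describing what a cancel request should do.
type CancelPlan =
  | { nextStatus: "PENDING_CANCEL"; notifyWorker: true }
  | { nextStatus: "FINISHED"; notifyWorker: false };

// Hypothetical decision function: a worker may hold the run, so ask it to stop;
// otherwise (assumed) the run can be finished directly without a notification.
function planCancel(status: ExecutionStatus): CancelPlan {
  if (isExecutingOrPendingExecuting(status)) {
    return { nextStatus: "PENDING_CANCEL", notifyWorker: true };
  }
  return { nextStatus: "FINISHED", notifyWorker: false };
}
```

Keeping the combined check in one helper means a later change to what counts as "a worker may hold this run" only has to be made in one place.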
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx}
📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)
**/*.{ts,tsx}
: Always prefer using isomorphic code like fetch, ReadableStream, etc. instead of Node.js specific code
For TypeScript, we usually use types over interfaces
Avoid enums
No default exports, use function declarations
Files:
internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts
🧠 Learnings (8)
📓 Common learnings
Learnt from: matt-aitken
PR: triggerdotdev/trigger.dev#2264
File: apps/webapp/app/services/runsRepository.server.ts:172-174
Timestamp: 2025-07-12T18:06:04.133Z
Learning: In apps/webapp/app/services/runsRepository.server.ts, the in-memory status filtering after fetching runs from Prisma is intentionally used as a workaround for ClickHouse data delays. This approach is acceptable because the result set is limited to a maximum of 100 runs due to pagination, making the performance impact negligible.
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-07-18T17:49:47.180Z
Learning: Do not use or add new code to the legacy run engine; focus on using and migrating to Run Engine 2.0 in `internal/run-engine`.
📚 Learning: 2024-10-18T15:41:52.352Z
Learnt from: nicktrn
PR: triggerdotdev/trigger.dev#1418
File: packages/core/src/v3/errors.ts:364-371
Timestamp: 2024-10-18T15:41:52.352Z
Learning: In `packages/core/src/v3/errors.ts`, within the `taskRunErrorEnhancer` function, `error.message` is always defined, so it's safe to directly call `error.message.includes("SIGTERM")` without additional checks.
Applied to files:
internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts
📚 Learning: 2025-07-18T17:50:25.014Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-07-18T17:50:25.014Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : When triggering a task from inside another task, use `yourTask.trigger`, `yourTask.batchTrigger`, `yourTask.triggerAndWait`, `yourTask.batchTriggerAndWait`, `batch.triggerAndWait`, `batch.triggerByTask`, or `batch.triggerByTaskAndWait` as shown.
Applied to files:
internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts
📚 Learning: 2025-07-18T17:50:25.014Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-07-18T17:50:25.014Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : When triggering a task from backend code, use `tasks.trigger`, `tasks.batchTrigger`, or `tasks.triggerAndPoll` as shown in the examples.
Applied to files:
internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts
📚 Learning: 2025-07-18T17:50:25.014Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-07-18T17:50:25.014Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : When using retry, queue, machine, or maxDuration options, configure them as shown in the examples for Trigger.dev tasks.
Applied to files:
internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts
📚 Learning: 2025-07-18T17:50:25.014Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-07-18T17:50:25.014Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : When using Realtime features, use the `runs.subscribeToRun`, `runs.subscribeToRunsWithTag`, and `runs.subscribeToBatch` APIs as shown.
Applied to files:
internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts
🧬 Code Graph Analysis (1)
internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts (2)
internal-packages/run-engine/src/engine/statuses.ts (3)
- isFinishedOrPendingFinished (34-37)
- isExecuting (8-14)
- isPendingExecuting (16-19)
internal-packages/run-engine/src/engine/errors.ts (1)
- ServiceValidationError (59-67)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: typecheck / typecheck
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (1)
internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts (1)
44-49: Importing status helpers is correct and consistent
All imported helpers are used below (no unused imports) and align with Run Engine 2.0 usage.
ef9ee5a to 86ba734
Actionable comments posted: 0
🧹 Nitpick comments (1)
packages/cli-v3/src/entryPoints/managed/execution.ts (1)
1045-1063: Consider standardizing shutdown reason tokens (optional)
Reason strings are a mix of spaced and kebab-case values. Normalizing (e.g., kebab-case) and/or using a typed string union/constant map can prevent typos and make log querying simpler. Non-blocking.
Example (local to this module):
const ShutdownReasons = {
  Kill: "kill",
  Shutdown: "shutdown",
  Abort: "abortExecution",
  AlreadyFinished: "already-finished",
  ReQueued: "re-queued",
  Suspended: "suspended",
  Deprecated: "deprecated-execution",
} as const;

type ShutdownReason = (typeof ShutdownReasons)[keyof typeof ShutdownReasons];

// usage:
// this.shutdownExecution(ShutdownReasons.AlreadyFinished)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts (3 hunks)
packages/cli-v3/src/entryPoints/managed/execution.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx}
📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)
**/*.{ts,tsx}
: Always prefer using isomorphic code like fetch, ReadableStream, etc. instead of Node.js specific code
For TypeScript, we usually use types over interfaces
Avoid enums
No default exports, use function declarations
Files:
packages/cli-v3/src/entryPoints/managed/execution.ts
🔇 Additional comments (1)
packages/cli-v3/src/entryPoints/managed/execution.ts (1)
303-305: Accurate shutdown reason for FINISHED (LGTM)
Switching the FINISHED path to use reason "already-finished" is correct and improves log clarity without changing behavior.
The old behaviour was wrong for pending executing in the test
Actionable comments posted: 0
🧹 Nitpick comments (6)
internal-packages/run-engine/src/engine/tests/cancelling.test.ts (6)
153-153: Fix comment to reference the correct API name.
Use the actual method name to avoid confusion.
- // call completeAttempt manually (this will happen from the worker)
+ // call completeRunAttempt manually (this will happen from the worker)
77-83: Reduce flakiness: avoid fixed sleeps before dequeues; poll until items are available.
Using a fixed 500ms delay can be racy under load. Poll for availability instead of sleeping once.
- //dequeue the run
- await setTimeout(500);
- const dequeued = await engine.dequeueFromWorkerQueue({
-   consumerId: "test_12345",
-   workerQueue: "main",
- });
+ //dequeue the run
+ const dequeued = await waitForDequeue(engine, "main", "test_12345");
...
- //dequeue the child run
- await setTimeout(500);
- const dequeuedChild = await engine.dequeueFromWorkerQueue({
-   consumerId: "test_12345",
-   workerQueue: "main",
- });
+ //dequeue the child run
+ const dequeuedChild = await waitForDequeue(engine, "main", "test_12345");
Add this helper near the top of the file (outside the test cases):
async function waitForDequeue(
  engine: RunEngine,
  workerQueue: string,
  consumerId: string,
  timeoutMs = 5000,
  intervalMs = 50
) {
  const start = Date.now();
  while (Date.now() - start < timeoutMs) {
    const items = await engine.dequeueFromWorkerQueue({ consumerId, workerQueue });
    if (items.length > 0) return items;
    await setTimeout(intervalMs);
  }
  throw new Error(`Timed out waiting to dequeue from ${workerQueue}`);
}
Also applies to: 114-120
178-184: Reduce flakiness: poll for the second worker notification instead of a fixed 200ms sleep.
Polling avoids timing issues and makes the test more robust.
- //cancelling children is async, so we need to wait a brief moment
- await setTimeout(200);
-
- //check a worker notification was sent for the running parent
- expect(workerNotifications).toHaveLength(2);
- expect(workerNotifications[1].run.id).toBe(childRun.id);
+ // wait until the child cancellation notification arrives
+ await waitFor(async () => workerNotifications.length >= 2);
+ expect(workerNotifications[1].run.id).toBe(childRun.id);
Add this tiny helper near the top (or reuse an existing one if you have it):
async function waitFor(
  predicate: () => boolean | Promise<boolean>,
  timeoutMs = 5000,
  intervalMs = 50
) {
  const start = Date.now();
  while (Date.now() - start < timeoutMs) {
    if (await predicate()) return;
    await setTimeout(intervalMs);
  }
  throw new Error("Timed out waiting for condition");
}
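One detail worth spelling out: the helper above calls `await setTimeout(intervalMs)`, which only pauses if `setTimeout` is the promise-returning version from `node:timers/promises` (the existing `await setTimeout(500)` calls in the test suggest the file already imports it, but that is an inference). A variant that does not depend on that import could be sketched like this; the `sleep` name is an assumption introduced for the example.

```typescript
// Promise-based sleep built on the global setTimeout, so it works the same
// in Node.js and in browser-like runtimes. `sleep` is a hypothetical name.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Polls `predicate` every `intervalMs` until it returns true,
// or rejects once `timeoutMs` has elapsed.
async function waitFor(
  predicate: () => boolean | Promise<boolean>,
  timeoutMs = 5000,
  intervalMs = 50
): Promise<void> {
  const start = Date.now();
  while (Date.now() - start < timeoutMs) {
    if (await predicate()) return;
    await sleep(intervalMs);
  }
  throw new Error("Timed out waiting for condition");
}

// Usage sketch inside a test body:
//   await waitFor(() => workerNotifications.length >= 2);
```

Because the predicate is checked before the first sleep, a condition that is already true resolves immediately, so the helper adds no delay to the fast path.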
122-126: Remove unused variable or assert its state.
childAttempt is never read. Either drop the binding or assert its expected state.
Option A (remove binding):
- const childAttempt = await engine.startRunAttempt({
+ await engine.startRunAttempt({
    runId: childRun.id,
    snapshotId: dequeuedChild[0].snapshot.id,
  });
Option B (assert it executed):
  const childAttempt = await engine.startRunAttempt({
    runId: childRun.id,
    snapshotId: dequeuedChild[0].snapshot.id,
  });
+ expect(childAttempt.snapshot.executionStatus).toBe("EXECUTING");
154-166: Either use completeResult or avoid assigning it.
Currently completeResult is never read. Prefer asserting the immediate outcome to strengthen the test.
  const completeResult = await engine.completeRunAttempt({
    runId: parentRun.id,
    snapshotId: executionData!.snapshot.id,
    completion: {
      ok: false,
      id: executionData!.run.id,
      error: {
        type: "INTERNAL_ERROR" as const,
        code: "TASK_RUN_CANCELLED" as const,
      },
    },
  });
+ expect(completeResult.snapshot.executionStatus).toBe("FINISHED");
292-303: Assert negative case: no worker notification for non-executing run.
This PR ensures we notify workers for executing or pending-executing runs. For a run that hasn't started executing, assert that no notification is sent.
Add a listener before cancellation:
  let cancelledEventData: EventBusEventArgs<"runCancelled">[0][] = [];
  engine.eventBus.on("runCancelled", (result) => {
    cancelledEventData.push(result);
  });
+
+ const workerNotifications: EventBusEventArgs<"workerNotification">[0][] = [];
+ engine.eventBus.on("workerNotification", (evt) => {
+   workerNotifications.push(evt);
+ });

  //cancel the parent run
  const result = await engine.cancelRun({
    runId: parentRun.id,
    completedAt: new Date(),
    reason: "Cancelled by the user",
  });
  expect(result.snapshot.executionStatus).toBe("FINISHED");
+ expect(workerNotifications).toHaveLength(0);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal-packages/run-engine/src/engine/tests/cancelling.test.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{ts,tsx}
📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)
**/*.{ts,tsx}
: Always prefer using isomorphic code like fetch, ReadableStream, etc. instead of Node.js specific code
For TypeScript, we usually use types over interfaces
Avoid enums
No default exports, use function declarations
Files:
internal-packages/run-engine/src/engine/tests/cancelling.test.ts
**/*.test.{ts,tsx}
📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)
Our tests are all vitest
Files:
internal-packages/run-engine/src/engine/tests/cancelling.test.ts
No description provided.